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Empirical Covariance Modeling for 21 cm Power Spectrum Estimation: A Method 
Demonstration and New Limits from Early Murchison Widefield Array 128-Tile Data 
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The separation of the faint cosmological background signal from bright astrophysical foregrounds 
remains one of the most daunting challenges of mapping the high-redshift intergalactic medium 
with the redshifted 21cm line of neutral hydrogen. Advances in mapping and modeling of diffuse 
and point source foregrounds have improved subtraction accuracy, but no subtraction scheme is 
perfect. Precisely quantifying the errors and error correlations due to missubtracted foregrounds 
allows for both the rigorous analysis of the 21 cm power spectrum and for the maximal isolation of 
the “EoR window” from foreground contamination. We present a method to infer the covariance of 
foreground residuals from the data itself in contrast to previous attempts at a priori modeling. We 
demonstrate our method by setting limits on the power spectrum using a 3h integration from the 
128-tile Murchison Widefield Array. Observing between 167 and 198 MHz, we find at 95% confidence 
a best limit of A®(fc) < 3.7 x 10® mK® at comoving scale k = 0A8h Mpc“® and at z = 6.8, consistent 
with existing limits. 

PACS numbers: 95.75.-z, 95.75.Kk, 98.80.-k, 98.80.Es 


I. INTRODUCTION 

Tomographic mapping of neutral hydrogen using its 
21 cm hyperfine transition has the potential to directly 
probe the density, temperature, and ionization of the in¬ 
tergalactic medium (IGM), from redshift 50 (and possibly 
earlier) through the end of reionization at z ^ 6. This 
unprecedented view of the so-called “cosmic dawn” can 
tightly constrain models of the first stars and galaxies 
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and eventually yield an order of magnitude more 
precise test of the standard cosmological model (AGDM) 
than current probes [S]. 

Over the past few years, first generation instruments 
have made considerable progress toward the detection 
of the power spectrum of the 21cm emission during the 
epoch of reionization (EoR). Telescopes such as the Low 
Frequency Array (LOFAR [6]), the Donald C. Backer 
Precision Array for Probing the Epoch of Reionization 
(PAPER [7]), the Giant Metrewave Radio Telescope 
(GMRT [8]), and the Murchison Widefield Array (MWA 
[9Hn]) are now operating, and have begun to set limits on 
the power spectrum. GMRT set some of the earliest lim- 
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its [S] and both PAPER [T^] and the MWA [T3] have pre¬ 
sented upper limits across multiple redshifts using small 
prototype arrays. PAPER has translated its results into 
a constraint on the heating of the IGM by the first gen¬ 
eration of x-ray binaries and miniquasars m and has 
placed the tightest constraints so far on the power spec¬ 
trum m and the thermal history of the IGM HU. 

Despite recent advances, considerable analysis chal¬ 
lenges remain. Extracting the subtle cosmological signal 
from the noise is expected to require thousand hour ob¬ 
servations across a range of redshifts [T7U22] . Even more 
daunting is the fact that the 21cm signal is probably at 
least 4 orders of magnitude dimmer than the astrophysi- 
cal foregrounds—due to synchrotron radiation from both 
our Galaxy and from other galaxies [25105] . 

Recently, simulations and analytical calculations have 
established the existence of a region in cylindrical Fourier 
space—in which three-dimensional (3D) Fourier modes 
k are binned into fey modes along the line of sight and 
k± modes perpendicular to it—called the “EoR window” 
that should be fairly free of foreground contamination 
[221 I22H55] . Observations of the EoR window confirm 
that it is largely foreground-free Ha [21] up to the sensi¬ 
tivity limits of current experiments. The boundary of the 
EoR window is determined by the volume and resolution 
of the observation, the intrinsic spectral structure of the 
foregrounds, and the so-called “wedge.” 

Physically, the wedge arises from the frequency depen¬ 
dence of the point spread function (PSF) of any interfer¬ 
ometer, which can create spectral structure from spec¬ 
trally smooth foregrounds in our 3D maps (see [35] for 
a rigorous derivation). Fortunately, in k\\-k± space, in¬ 
strumental chromaticity from flat-spectrum sources is re¬ 
stricted to the region below 
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where Dh = c/Hq, E{z) = and 

Dm{z) = Az'/E{z') with cosmological parameters 

from m- The size of the region is determined by Oq, the 
angle from zenith beyond which foregrounds do not sig¬ 
nificantly contribute. While most of the foreground emis¬ 
sion we observe should appear inside the main lobe of the 
primary beam, foreground contamination from sources in 
the sidelobes are also significant compared to the signal 
[38l [39] . A conservative choice of 6o is therefore tt/2, 
which reflects the fact that the maximum possible delay 
a baseline can measure corresponds to a source at the 
horizon m- Still, this foreground isolation is not fool¬ 
proof and can be easily corrupted by miscalibration and 
imperfect data reduction. Further, slowly varying spec¬ 
tral modes just outside the wedge are also affected when 
the foreground residuals have spectral structure beyond 
that imprinted by the chromaticity of the interferometer. 

To confidently detect the 21cm EoR power spectrum, 
we need rigorous statistical techniques that incorporate 
models of the cosmological signal, the foregrounds, the 


instrument, the instrumental noise, and the exact map¬ 
making procedure. With this information, one may use 
estimators that preserve as much cosmological informa¬ 
tion as possible and thoroughly propagate errors due to 
noise and foregrounds through the analysis pipeline. 

The development of such statistical techniques has pro¬ 
gressed rapidly over the past few years. The quadratic 
estimator formalism was adapted [40] from previous work 
on the cosmic microwave background [41] and galaxy 
surveys [12]. It was accelerated to meet the data vol¬ 
ume challenges of 21 cm tomography [43] and refined to 
overcome some of the difficulties of working with real 
data m- Further, recent work has shown how to rig¬ 
orously incorporate the interferometric effects that cre¬ 
ate the wedge [35l |36l [44] , though they rely on precision 
instrument modeling, including exact per-frequency and 
per-antenna primary beams and complex gains. A sim¬ 
ilar technique designed for drift-scanning telescopes us¬ 
ing spherical harmonic modes was developed in [451146] , 
which also demonstrated the need for a precise under¬ 
standing of one’s instrument. 

However, at this early stage in the development of 
21 cm cosmology, precision instrument characterization 
remains an active area of research [571150] . We thus pur¬ 
sue a more cautious approach to foreground modeling 
that reflects our incomplete knowledge of the instrument 
by modeling the residual foreground covariance from the 
data itself. As we will show, this mitigates systematics 
such as calibration errors that would otherwise impart 
spectral structure onto the foregrounds, corrupting the 
EoR window. While not a fully Bayesian approach like 
those of m and [52], our technique discovers both the 
statistics of the foregrounds and the power spectrum from 
the data. Our foreground models are subject to certain 
prior assumptions but are allowed to be data motivated 
in a restricted space. However, by working in the con¬ 
text of the quadratic estimator formalism, we can benefit 
from the computational speedups of [43]. This work is 
meant to build on those techniques and make them more 
easily applied to real and imperfect data. 

This paper is organized into two main parts. In Sec¬ 
tion [H] we discuss the problem of covariance modeling in 
the context of power spectrum estimation and present a 
method for the empirical estimation of that foreground 
model, using MWA data to illustrate the procedure. 
Then, in Section HI we explain how these data were 
taken and reduced into maps and present the results of 
our power spectrum estimation procedure on a few hours 
of MWA observation, including limits on the 21 cm power 
spectrum. 


II. EMPIRICAL COVARIANCE MODELING 


Before presenting our method of empirically model¬ 
ing the statistics of residual foregrounds in our maps, 
we need to review the importance of these covariances 
to power spectrum estimation. We begin in Section HA 
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with a brief review of the quadratic estimator formalism 
for optimal power spectrum estimation and rigorous er¬ 
ror quantification. We then discuss in Section |II B| the 
problem of covariance modeling in greater detail, high¬ 
lighting exactly which unknowns we are trying to model 
with the data. Next we present in Section |II C| our em¬ 
pirical method of estimating the covariance of foreground 
residuals, illustrated with an application to MWA data. 


Lastly, we review in Section IID the assumptions and 
caveats that we make or inherit from previous power 
spectrum estimation work. 


fast method of [13] which uses fast Fourier transforms and 
Monte Carlo simulations to approximate these matrices. 

Finally, temporally interleaving the input data into two 
cubes Xi and X 2 with the same sky signal but indepen¬ 
dent noise avoids a noise contribution to the bias as in 
[13] . Again following m. we abstain from subtracting a 
foreground residual bias in order to avoid any signal loss 
(as discussed in IIC3). 


B. What Does Our Covariance Model Represent? 


A. Quadratic Power Spectrum Estimator Review 

The fundamental goal of power spectrum estimation 
is to reduce the volume of data by exploiting statistical 
symmetries while retaining as much information as possi¬ 
ble about the cosmological power spectrum m- We seek 
to estimate a set of band powers p using the approxima¬ 
tion that 

^’(k) « 

a. 

where T’(k) is the power spectrum as a function of wave 
vector k and Xa is an indicator function that equals 1 
wherever we are approximating P(k) by Pa and vanishes 
elsewhere. 

Following [111101113], we estimate power spectra from 
a “data cube”—a set of sky maps of brightness temper¬ 
ature at many closely spaced frequencies—which we rep¬ 
resent as a single vector x whose index iterates over both 
position and frequency. From x, we estimate each band 
power as 

Pa = (xi - C~'^C,p (X 2 “ M) “ ba- (3) 

Here fi = (x), the ensemble average of our map over 
many different realizations of the observation, and C is 
the covariance of our map, 

C = (xxT)-(x)(x)T (4) 

C,^ is a matrix that encodes the response of the covari¬ 
ance to changes in the true, underlying band powers; 
roughly speaking, it performs the Fourier transforming, 
squaring, and binning steps one normally associates with 
computing power spectraH Additionally, M is an invert¬ 
ible normalization matrix and be is the power spectrum 
bias from nonsignal contaminants in x. In this work, we 
follow m and choose a form of M such that S = Cov(p) 
is diagonal, decorrelating errors in the power spectrum 
and thus reducing foreground leakage into the EoR win¬ 
dow. In order to calculate M and S quickly, we use the 


^ For a derivation of an explicit form of C,^, see Sol or m- 


Our brightness temperature data cubes are made 
up of contributions from three statistically independent 
sources: the cosmological signal, x'®; the astrophysical 
foregrounds, x^*^; and the instrumental noise x^. It fol¬ 
lows that the covariance matrix is equal to the sum of 
their separate covariances: 

C = C^ + C^° + C^. (5) 

Hidden in the statistical description of the different 
contributions to our measurement is an important sub¬ 
tlety. Each of these components is taken to be a par¬ 
ticular instantiation of a random process, described by 
a mean and covariance. In the case of the cosmologi¬ 
cal signal, it is the underlying statistics—the mean and 
covariance—which encode information about the cosmol¬ 
ogy and astrophysics. However, we can only learn about 
those statistics by assuming statistical isotropy and ho¬ 
mogeneity and by assuming that spatial averages can 
stand in for ensemble averages in large volumes. In the 
case of the instrumental noise, we usually think of the 
particular instantiation of the noise that we see as the 
result of a random trial. 

The foregrounds are different. There is only one set 
of foregrounds, and they are not random. If we knew 
exactly how the foregrounds appear in our observations, 
we would subtract them from our maps and then ignore 
them in this analysis. We know that we do not know 
the foregrounds exactly, and so we choose to model them 
with our best guess, . If we define the cosmological 
signal to consist only of fluctuations from the brightness 
temperature of the global 21cm signal, then the signal 
and the noise both have = 0. Therefore, we 

start our power spectrum estimation using Equation @ 
by subtracting off our best guess as to the foreground 
contamination in our map. But how wrong are we? 

The short answer is that we do not really know that ei¬ 
ther. But, if we want to take advantage of the quadratic 
estimator formalism to give the highest weight to the 
modes we are most confident in, then we must model the 
statistics of our foreground residuals. If we assume that 
our error is drawn from some correlated Gaussian distri¬ 
bution, then we should use that foreground uncertainty 
covariance as the proper in Equation ([^. 

So what do we know about the residual foregrounds in 
our maps? In theory, our dirty maps are related to the 
true sky by a set of point spread functions that depend 
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on both position and frequency |44j . This is the result 
of both the way our interferometer turns the sky into 
measured visibilities and the way we make maps to turn 
those visibilities into x. In other words, there exists some 
matrix of PSFs, P such that 


(x) = Px 


true 


( 6 ) 


The spectral structure in our maps that creates the wedge 
feature in the power spectrum is a result of P. 

We can describe our uncertainty about the true sky— 
about the positions, fluxes, and spectral indices of both 
diffuse foregrounds and points sources—with a covariance 
matrix C^G,true go that 


CFG 


_ p^FGjtruepT 


(7) 


This equation presents us with two ways of modeling the 
foregrounds. If we feel that we know the relationship be¬ 
tween our dirty maps and the true sky precisely, then we 
can propagate our uncertainty about a relatively small 
number of foreground parameters, as discussed by [40] 
and [33], through the P matrix to get This tech¬ 

nique, suggested by [33], relies on precise knowledge of 
P. Of course, the relationship between the true sky and 
our visibility data depends both on the design of our 
instrument and on its calibration. If our calibration is 
very good—if we really understand our antenna gains 
and phases, our primary beams, and our bandpasses— 
then we can accurately model P. 

If we are worried about systematics (and at this early 
stage of 21 cm tomography with low frequency radio in¬ 
terferometers, we certainly are), then we need a comple¬ 
mentary approach to modeling directly, one that we 
can use both for power spectrum estimation and for com¬ 
parison to the results of a more theoretically motivated 
technique. This is the main goal of this work. 


C. Empirical Covariance Modeling Technique 

The idea of using empirically motivated covariance ma¬ 
trices in the quadratic estimator formalism has some his¬ 
tory in the field. Previous MWA power spectrum analy¬ 
sis m used the difference between time-interleaved data 
cubes to estimate the overall level of noise, empirically 
calibrating Tsys, the system temperature of the elements. 
paper’s power spectrum analysis relies on using ob¬ 
served covariances to suppress systematic errors [M] and 
on boot-strapped error bars [I1[T3]. A similar technique 
was developed contemporaneously with this work and 
was used by m to estimate covariances. 

has far more elements than we have measured 
voxels—our cubes have about 2 x 10® voxels, meaning 
that has up to 2 x 10^° unique elements. Therefore, 
any estimate of from the data needs to make some 
assumptions about the structure of the covariance. Since 
foregrounds have intrinsically smooth spectra, and since 
one generally attempts to model and subtract smooth 


spectrum foregrounds, it follows that foreground residu¬ 
als will be highly correlated along the line of sight. Af¬ 
ter all, if we are undersubtracting foregrounds at one 
frequency, we are probably undersubtracting at nearby 
frequencies too. We therefore choose to focus on em¬ 
pirically constructing the part of that corresponds 
to the frequency-frequency covariance—the covariance 
along the line of sight. If there are Uf frequency chan¬ 
nels, then that covariance matrix is only nf xuf elements 
and is likely dominated by a relatively small number of 
modes. 

In this section, we will present an approach to solving 
this problem in a way that faithfully reflects the com¬ 
plex spectral structure introduced by an (imperfectly cal¬ 
ibrated) interferometer on the bright astrophysical fore¬ 
grounds. As a worked example, we use data from a short 
observation with the MWA which we will describe in de¬ 
tail in Section |III| We begin with a uniformly weighted 
map of the sky at each frequency, a model for both point 
sources and diffuse emission imaged from simulated vis¬ 
ibilities, and a model for the noise in each uv cell as a 
function of frequency. 

The idea to model empirically was put forward 
by Liu [S3]. He attempted to model each line of sight as 
statistically independent and made no effort to separate 
from or to reduce the residual noisiness of the 
frequency-frequency covariance. 

Our approach centers on the idea that the covariance 
matrix can be approximated as block diagonal in the uv 
basis of Fourier modes perpendicular to the line of sight. 
In other words, we are looking to express as 

(^uu'vv'ff ~ ^uu'^vv'Cf ( 8 ) 

where A:_l is a function of + v'^. This is the ten¬ 
sor product of our best guess of the frequency-frequency 
covariance C and the identity in both Fourier coordi¬ 
nates perpendicular to the line of sight. In this way, we 
can model different frequency-frequency covariances as 
a function of |m| or equivalently, k±, reflecting that fact 
that the wedge results from greater leakage of power up 
from low fc|| as one goes to higher k±. This method also 
has the advantage that C becomes efficient to both write 
down and invert directly, removing the need for the pre¬ 
conditioned conjugate gradient algorithm employed by 

BSj- 

This approximation is equivalent to the assumption 
that the residuals in every line of sight are statistically 
independent of position. This is generally a pretty ac¬ 
curate assumption as long as the primary beam does 
not change very much over the map from which we es¬ 
timate the power spectrum. However, because Cff>{k±) 
depends on the angular scale, we are still modeling corre¬ 
lations that depend only on the distance between points 
in the map. 

While we might expect that the largest residual voxels 
correspond to errors in subtracting the brightest sources, 
the voxels in the residual data cube (the map minus the 
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model) are only weakly correlated with the best-guess 
model of the foregrounds (we hnd a correlation coefficient 
p — 0.116, which suggests that sources are removed to 
roughly the 10% level, assuming that undersubtraction 
dominates). As we improve our best guess of the model 
foregrounds through better deconvolution, we expect p 
to go down, improving the assumption that foregrounds 
are block diagonal in the uv basis. We will now present 
the technique we have devised in four steps, employing 
MWA data as a method demonstration. 


1. Compute sample covariances in uv annuli 

We begin our empirical covariance calculation by tak¬ 
ing the residual data cubes, defined as 

x™’' = Xi/2 -k X 2/2 - M, (9) 

and performing a discrete Fourier transforir0 at each fre¬ 
quency independently to get x’’®®. This yields n^, x Uy 
sample “lines of sight” {uv cells for all frequencies), as 
many as we have pixels in the map. As a first step to¬ 
ward estimating C, we use the unbiased sample covari¬ 
ance estimator from these residual lines of sight. How¬ 
ever, instead of calculating a single frequency-frequency 
covariance, we want to calculate many different C"'®® ma¬ 
trices to reflect the evolution of spectral structure with 
k±_ along the wedge. We therefore break the uv plane 
into concentric annuli of equal width and calculate C))®^® 
for each uv cell as the sample covariance of the — 2 
lines of sight in that annulus, excluding the cell consid¬ 
ered and its complex conjugate. Since the covariance is 
assumed to be block diagonal, this eliminates a poten¬ 
tial bias that comes from downweighting a uv cell using 
information about that cell. Thus, 

^ {K%’f - (sr>) (*!“■,. - ' 

^uvjf 2^ jyLOS _ 2 — 1 ’ 

other u' ,v' 
in annulus 

( 10 ) 

where (a:™®) is an average over all m' and v' in the annu¬ 
lus. We expect this procedure to be particularly effective 
in our case because the uv coverage of the MWA after 
rotation synthesis is relatively symmetric. 

As a sense check on these covariances, we plot their 
largest 30 eigenvalues in Figure We see that as |m| 
(and thus fcj,) increases, the eigenspectra become shal¬ 
lower. At high fcj_, the effect of the wedge is to leak 
power to a range of fey values. The eigenspectrum of 
intrinsically smooth foregrounds should be declining ex¬ 
ponentially M- The wedge softens that decline. These 


^ For simplicity, we used the unitary discrete Fourier transform 
for these calculations and ignore any factors of length or inverse 
length that might come into these calculations only to be can¬ 
celed out at a later step. 



FIG. 1. The evolution of the wedge with k± motivates us 
to model foregrounds separately for discrete values of k±. In 
this plot of the 30 largest eigenvalues of the observed residual 
covariance (which should include both noise and foregrounds) 
sampled in six concentric annuli, we see steeper declines to¬ 
ward a noise floor for the inner annuli than the outer annuli. 
This is consistent with the expected effect of the wedge— 
higher k± modes should be foreground contaminated at higher 


trends are in line with our expectations and further moti¬ 
vate our strategy of forming covariance matrices for each 
annulus independently. 

Because we seek only to estimate the foreground por¬ 
tion of the covariance, the formal rank deficiency of C)j®„® 
is not a problemj^ All we require is that the largest (and 
thus more foreground-dominated) modes be well mea¬ 
sured. In this analysis, we used six concentric annuli to 
create six different frequency-frequency foreground co- 
variances. Using more annuli allows for better modeling 
of the evolution of the wedge with k± at the expense of 
each estimate being more susceptible to noise and rank 
deficiency. 


2. Subtract the properly projected noise covariance. 

The covariances computed from these uv lines of sight 
include contributions from the 21cm signal and instru¬ 
mental noise as well as foregrounds. We can safely ig¬ 
nore the signal covariance for now as we are far from the 
regime where sample variance is signihcant. We already 
have a theoretically motivated model for the noise (based 
on the uv sampling) that has been empirically rescaled 
to the observed noise in the difference of time-interleaved 
data (the same basic procedure as in |I3j)- We would like 


® In fact, the rank of is A^los — 3 if — 2 < nj. 










6 


an empirical estimate of the residual foreground covari¬ 
ance alone to use in and thus must subtract off the 
part of our measurement we think is due to noise vari¬ 
ance. ^ ^ 

To get to from we subtract our best guess 
of the portion of that is due to noise, which we 
approximate by averaging the noise model variances in all 
the other uv cells in the annulus at that given frequency, 
yielding 

^uvjf = ^LOS X/ (11) 

other u ^v' 
in annulus 

Note, however, that is full rank while C™® is typ¬ 
ically rank deficient. Thus a naive subtraction would 
oversubtract the noise variance in the part of the sub¬ 
space of where C))®® is identically zero. Instead, the 
proper procedure is to find the projection matrices Tluv 
that discard all eigenmodes outside the subspace where 
is full rank. Each should have eigenvalues equal to 
zero or one only and have the property that 

TT pres-rrT _ pres p r)', 

^^uv'^uv‘-‘-uv — '^UV ■ 

Only after projecting out the part of inside the un¬ 
sampled subspace can we self-consistently subtract our 
best guess of the noise contribution to the subspace in 
which we seek to estimate foregrounds. In other words, 
we estimate as 

pFG _ pres _ -pr pN ttT p o\ 

We demonstrate the effectiveness of this technique in 
Figure by plotting the diagonal elements of the Fourier 
transform of and along the line of sight. Sub¬ 
tracting of the noise covariance indeed eliminates the ma¬ 
jority of the power in the noise dominated modes at high 
fc||; thus we expect it also to fare well in the transition 
region near the edge of the wedge where foreground and 
noise contributions are comparable. 

3. Perform a filter on the covariance. 

Despite the relatively clean separation of foreground 
and noise eigenvalues, inspection of some of the 
foreground-dominated modes in the top panel of Figure 

reveals residual noise. Using a foreground covariance 
constructed from these noisy foreground eigenmodes to 
downweight the data during power spectrum estimation 
would errantly downweight some high fey modes in addi¬ 
tion to the low fcy foreground-dominated modes. To avoid 
this double counting of the noise, we allow the foreground 
covariance to include only certain fcy modes by filtering 

in Fourier space to get Put another 

way, we are imposing a prior on which Fourier modes 
we think have foreground power in them. The resulting 


noise filtered eigenmodes are shown in the bottom panel 
of Figure 

In practice, implementing this filter is subtle. We inter¬ 
polate over the flagged frequency channels using a 
cubic spline, then symmetrically pad the covariance ma¬ 
trix, forcing its boundary condition to be periodic. We 
then Fourier transform, filter, inverse Fourier transform, 
remove the padding, and then rezero the flagged chan¬ 
nels. 

Selecting a filter to use is also a subtle choice. We 
first keep modes inside the horizon wedge with an added 
buffer. For each annulus, we calculate a mean value of 
fc_L, and then use Equation Q to calculate the fcy value of 
the horizon wedge, using 9q = 7r/2. Although the litera¬ 
ture suggests a 0.1 to 0.15 /iMpc“^ buffer for “suprahori- 
zon emission” due to some combination of intrinsic spec¬ 
tral structure of foregrounds, primary beam chromatic- 
ity, and finite bandwidth [551ES] , we pick a conservative 
0.5 h Mpc“^. Then we examine the diagonal of (Fig¬ 
ure to identify additional foreground modes, this time 
in the EoR window, due to imperfect bandpass calibra¬ 
tion appearing as spikes. One example is the peak at 
fcy ^ 0.45 h Mpc“^. Such modes contribute errant power 
to the EoR window at constant fcy. Since these modes 
result from the convolution of the foregrounds with our 
instrument, they also should be modeled in in or¬ 

der to minimize their leakage into the rest of the EoR 
window. 

One might be concerned that cosmological signal and 
foregrounds theoretically both appear in the estimate of 
that we have constructed, especially with our con¬ 
servative 0.5/iMpc“^ buffer that allows foregrounds to 
be discovered well into the EoR window. For the pur¬ 
poses of calculating C“^(x — ^l) in the quadratic esti¬ 
mator in Equation , that is fine since its effect is to 
partially relax the assumption that sample variance can 
be ignored. However, the calculation of the bias depends 
on being able to differentiate signal from contaminants 

gni mills]. 

The noise contribution to the bias can be eliminated by 
cross-correlating maps made from interleaved time steps 
m- However, we cannot use our inferred to sub¬ 
tract a foreground bias without signal loss. That said, 
we can still set an upper limit on the 21 cm signal. By 
following the data and allowing the foreground covari¬ 
ance to have power inside the EoR window, we are min¬ 
imizing the leakage of foregrounds into uncontaminated 
regions and we are accurately marking those regions as 
having high variance. As calibration and the control of 
systematic effects improves, we should be able to isolate 
foregrounds to outside the EoR window, impose a more 
aggressive Fourier filter on and make a detection 

of the 21cm signal by employing foreground avoidance. 
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FIG. 2. Examining the diagonal elements of the observed residual and inferred foreground covariance matrices in Fourier space 
reveals the effectiveness of subtracting model for the noise covariance. In red, we plot the observed residual covariance, which 
contains both foregrounds and noise. As a function of fc||, the two separate relatively cleanly—there is a steeply declining 
foreground portion on the left followed by a relatively flat noise floor on the right. The theory that the right-hand portion is 
dominated by noise is borne out by the fact that it so closely matches the observed noise covariance, inferred lines of sight 
of xi — X 2 , which should have only noise and no sky signal at all. The regions where they differ significantly, for example at 
fc|| ~ 0.45/iMpc“^ , are attributable to systematic effects like the MWA’s coarse band structure that have not been perfectly 
calibrated out. For the example covariances shown here (which correspond to a mode in the annulus at k±_ ~ 0.010 h Mpc“^), 
we can see that subtracting a properly projected noise covariance removes most of the power from the noise-dominated region, 
leaving only residual noise that appears both as negative power (open blue circle) and as positive power (closed blue circles) 
at considerably lower magnitude. 


4 . Cut out modes attributable to noise. 

After suppressing the noisiest modes with our Fourier 
filter, we must select a cutoff beyond which the fore¬ 
ground modes are irrecoverably buried under noise. We 
do this by inspecting the eigenspectrum of 
The true by definition, admits only positive eigen¬ 

values (though some of them should be vanishingly 
small). 

By limiting the number of eigenvalues and eigenvectors 
we ultimately associate with foregrounds, we also limit 
the potential for signal loss by allowing a large portion of 
the free parameters to get absorbed into the contaminant 
model [m [55]. When measuring the power spectrum 
inside the EoR window, we can be confident that signal 
loss is minimal compared to foreground bias and other 
errors. 

We plot in Figure the eigenspectra of 
and sorted by absolute value. There are 

two distinct regions—the sharply declining foreground- 
dominated region and a flatter region with many nega¬ 
tive eigenvalues. We excise eigenvectors whose eigenval¬ 
ues are smaller in absolute value than the most negative 
eigenvalue. This incurs a slight risk of retaining a few 
noise dominated modes, albeit strongly suppressed by 
our noise variance subtraction and our Fourier filtering. 
Finally we are able to construct the full covariance C 
using Equation ([^. 


D. Review of Assumptions and Caveats 

Before proceeding to demonstrate the effectiveness of 
our empirical covariance modeling method, it is useful 
to review and summarize the assumptions made about 
mapmaking and covariance modeling. Some are inher¬ 
ited from the previous application of quadratic power 
spectrum estimation to the MWA [T3], while others are 
necessitated by our new, more faithful foreground covari¬ 
ance. Relaxing these assumptions in a computationally 
efficient manner remains a challenge we leave for future 
work. 

i. We adopt the flat sky approximation as in [laiig, al¬ 
lowing us to use the fast Eourier transform to quickly 
compute power spectra. The error incurred from this 
approximation on the power spectrum is expected to 
be smaller than 1% m- 

ii. We assume the expectation value of our uniformly 
weighted map is the true sky (i.e., (x) = x‘''“®) when 
calculating C ,/3 in Equation 1^ again following [13]. 
In general (x) is related to by P, the matrix 
of point spread functions [H]. Here we effectively 
approximate the PSE as position independent. Re¬ 
laxing this approximation necessitates the full map¬ 
making theory presented in |44] which has yet to be 
integrated into a power spectrum estimation pipeline. 

iii. We approximate the foreground covariance as uncor- 
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FIG. 3. The foreground covariance we estimate from our lim¬ 
ited data set is still very noisy, and we run the risk of overfit¬ 
ting the noise in our measurements if we take it at face value. 
In the top panel, we plot the eigenvectors corresponding to 
the five largest eigenvalues of for a mode in the annu¬ 
lus centered on k± « 0.010hMpc“'^. In the bottom panel, 
we show dominant eigenvectors of the Fourier-filtered covari¬ 
ance. As expected, they resemble the hrst five Fourier modes. 
The missing data every 1.28 MHz are due to channels flagged 
at the edge of the coarse bandpass of the MWA’s polyphase 
filter bank—the most difficult part of the band to calibrate. 
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related between different uv cells (and thus block di¬ 
agonal). At some level there likely are correlations 
in uv, though those along the line of sight are far 
stronger. It may be possible to attempt to calculate 
these correlations empirically, but it would be very 
difficult considering relative strength of line-of-sight 
correlations. It may also be possible to use a nonem- 
pirical model, though that has the potential to make 
the computational speedups of [43] more difficult to 
attain. 

iv. We approximate the frequency-frequency foreground 
covariance as constant within each annulus, estimat¬ 
ing our covariance for each uv cell only from other 
cells in the same annulus. In principle, even if the 
foreground residuals were isotropic, there should be 



FIG. 4. The evolution of the eigenvalues of our estimated 
foreground covariance matrix for a mode in the annulus cor¬ 
responding to fcj_ ~ 0.010 ZiMpc”^ at each of the hrst three 
stages of covariance estimation. First we calculate a sam¬ 
ple covariance matrix from the residual data cubes (shown in 
red). Next we subtract our best guess as to the part of the di¬ 
agonal of that matrix that originates from instrumental noise, 
leaving the blue dots (open circles are absolute values of neg¬ 
ative eigenvalues). Then we hlter out modes in Fourier space 
along the line of sight that we think should be noise dom¬ 
inated, leaving the black dots. Finally, we project out the 
eigenvectors associated with eigenvalues whose magnitude is 
smaller than the largest negative eigenvalue, since those are 
likely due to residual noise. What remains is our best guess 
at the foreground covariance in an annulus and incorporates 
as well as possible our prior beliefs about its structure. 


radial evolution within each annulus which we ignore 
for this analysis. 

V. The Fourier filter is a nontrivial data analysis choice 
balancing risk of noise double counting against that 
of insufficiently aggressive foreground downweight¬ 
ing. 

vi. In order to detect the 21 cm signal, we assume that 
foregrounds can be avoided by working within the 
EoR window. Out of fear of losing signal, we make 
no effort to subtract a residual foreground bias from 
the window. This makes a detection inside the wedge 
impossible and it risks confusing foreground contam¬ 
ination in the window for a signal. Only analysis of 
the dependence of the measurement on z, k, /cy, and 
k± can distinguish between systematics and the true 
signal. 


III. RESULTS 

We can now demonstrate the statistical techniques we 
have motivated and developed in Section |TT| on the prob¬ 
lem of estimating power spectra from a 3 h observation 
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with the 128-antenna MWA. We begin with a discussion 
of the instrument and the observations in Section FlII A1 In 
Section pIIB| we detail the data processing from raw vis¬ 
ibilities to calibrated maps from which we estimate both 
the foreground residual covariance matrix and the power 
spectrum. Finally, in Section |III| we present our results 
and discuss lessons learned looking toward a detection of 
the 21 cm signal. 


A. Observation Summary 

The 128-antenna Murchison Widefield Array began 
deep EoR observations in mid-2013. We describe here 
the salient features of the array and refer to |TU] for a 
more detailed description. The antennas are laid out 
over a region of radius 1.5 km in a quasirandom, cen¬ 
trally concentrated distribution which achieves approxi¬ 
mately complete uv coverage at each frequency over sev¬ 
eral hours of rotation synthesis m- Each antenna ele¬ 
ment is a phased array of 16 wideband dipole antennas 
whose phased sum forms a discretely steerable 25° beams 
(full width at half maximum) at 150 MHz with frequency- 
dependent, percent level sidelobes [35]. We repoint the 
beam to our field center on a 30 min cadence to correct 
for earth rotation, effectively acquiring a series of drift 
scans over this field. 

We observe the MWA “EORO” deep integration field, 
centered at R.A.(J2000) = O^’O'^O" and decl.(J2000) 
= —30°0'0". It features a near-zenith position, a high 
Galactic latitude, minimal Galactic emission |58j . and 
an absence of bright extended sources. This last prop¬ 
erty greatly facilitates calibration in comparison to the 
“EOR2” field—a field dominated by the slightly resolved 
radio galaxy Hydra A at its center—which was used by 
[55] and m- A nominal 3 h set of EORO observations 
was selected during the first weeks of observing to use 
for refining and comparing data processing, imaging, and 
power spectra pipelines [60] . In this work, we use the 
“high band,” near-zenith subset of these observations 
with 30.72 MHz of bandwidth and center frequency of 
182 MHz, recorded on Aug 23, 2013 between 16:47:28 and 
19:56:32 UTC (22.712 and 1.872 hours LST). 


B. Calibration and Mapmaking Summary 

Preliminary processing, including radio frequency in¬ 
terference (REI) flagging followed by time and frequency 
averaging, was performed with the COTTER package m 
on raw correlator data. These data were collected at 
40 kHz resolution with an integration time of 0.5 s, and 
averaged to 80 kHz resolution with a 2 s integration time 
to reduce the data volume. Additionally, 80 kHz at the 
upper and lower edges of each of 24 coarse channels (each 
of width 1.28 MHz) of the polyphase filter bank is flagged 
due to known aliasing. 


As in [T3], we undertake snapshot-based processing in 
which each minute-scale integration is calibrated and im¬ 
aged independently. Snapshots are combined only at the 
last step in forming a Stokes I image cube, allowing us 
to properly align and weight them despite different pri¬ 
mary beams due to sky rotation and periodic repointing. 
While sources are forward modeled for calibration and 
foreground subtraction using the full position dependent 
PSF (i.e., the synthesized beam), we continue to approx¬ 
imate it as position independent (and equal to that of 
a point source at the field center) during application of 
uniform weighting and computation of the noise covari¬ 
ance. 

We use the calibration, foreground modeling, and first 
stage image products produced by the Fast Holographic 
Deconvolutior[^(FHD) pipeline as described by [55]. The 
calibration implemented in the FHD package is an adap¬ 
tation of the fast algorithm presented by [63] with a 
baseline cutoff of 6 > 50A. In this data reduction, the 
point source catalogs discussed below are taken as the 
sky model for calibration. Solutions are first obtained 
per antenna and per frequency before being constrained 
to linear phase slopes and quadratic amplitude functions 
after correcting for a median antenna-independent ampli¬ 
tude bandpass. The foreground model used for subtrac¬ 
tion includes models both of diffuse radio emission [53] 
and point sources. In detail, the point source catalog is 
the union of a deep MWA point source survey within 20° 
of the field center [65], the shallower but wider MWA 
commissioning point source survey [55], and the Gulgo- 
ora catalog m- Note that calibration and foreground 
subtraction of off-zenith observations are complicated by 
Galactic emission picked up by primary beam sidelobes, 
and are active topics of investigation [SH] [351ISH] • Dur¬ 
ing these observations a single antenna was flagged due 
to known hardware problems, and 1-5 more were flagged 
for any given snapshot due to poor calibration solutions. 

These calibration, foreground modeling, and imaging 
steps constitute notable improvements over m- In that 
work, the presence of the slightly resolved Hydra A in 
their EOR2 field likely limited calibration and subtrac¬ 
tion fidelity as only a point source sky model was used. 
In contrast, the EORO field analyzed here lacks any such 
nearby radio sources. Our foreground model contains 
~ 2500 point sources within the main lobe and several 
thousand more in the primary beam sidelobes in addition 
to the aforementioned diffuse map. A last improvement 
in the imaging is the more frequent interleaving of time 
steps for the cross power spectrum, which we performed 
at the integration scale (2 s) as opposed to the snapshot 
scale (a few minutes). This ensures that both Xi and 
X 2 have identical sky responses and thus allows us to ac¬ 
curately estimate the noise in the array from difference 
cubes. Assuming that the system temperature contains 


For a theoretical discussion of the algorithm see m- The code 
is available at https://github.coin/iniguelfinorales/FHD 
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both an instrumental noise temperature and a frequency 
dependent sky noise temperature that scales as 
the observed residual root-mean-square brightness tem¬ 
perature is consistent with T^yg ranging from 450 K at 
167MHz to 310K at 198MHz, in line with expectations 

m- 

As discussed in EO] and [^, FHD produces naturally 
weighted sky, foreground model, “weights,” and “vari¬ 
ances” cubes, as well as beam-squared cubes. All are 
saved in image space using the HEALPix format m with 
A^side = 1024. Note that these image cubes are crops of 
full-sky image cubes to a 16° x 16° square field of view, as 
discussed below. The sky, foreground model, and weights 
cubes are image space representations of the measured 
visibilities, model visibilities, and sampling function, re¬ 
spectively, all originally gridded in uv space using the 
primary beam as the gridding kernel. The variances cube 
is similar to the weights cube, except the gridding kernel 
is the square of the uv space primary beam. It repre¬ 
sents the proper quadrature summation of independent 
noise in different visibilities when they contribute to the 
same uv cell, and will ultimately become our diagonal 
noise covariance model. The FHD cubes from all ninety- 
four 112 s snapshots are optimally combined in this “holo¬ 
graphic” frame in which the true sky is weighted by two 
factors of the primary beam, as in m- 

We perform a series of steps to convert the image 
cube output of FHD into uniformly weighted Stokes I 
cubes accompanied by appropriate uv coverage infor¬ 
mation for our noise model. We first map these data 
cubes onto a rectilinear grid, invoking the flat sky ap¬ 
proximation. We do this by rotating the (RA,Dec) 
HEALPix coordinates of the EORO field to the north 
pole (0°,90°), and then projecting and gridding onto the 
xy plane with 0.2° x 0.2° resolution over a 16° x 16° 
square field of view. To reduce the data volume while 
maintaining cosmological sensitivity, we coarse grid to 
approximately 0.5° resolution by Eourier transforming 
and cropping these cubes in the uv plane at each fre¬ 
quency. We form a uniformly weighted Stokes I cube 
by first summing the XX and YY data cubes, re¬ 
sulting in a naturally weighted, holographic stokes I cube 
Ina.t,h{Q) = Ixx,h{d) + lYY,h0). Then we divide out the 
holographic weights cube Wh{0) in uv space, which ap¬ 
plies uniform weighting and removes one image space fac¬ 
tor of the beam, and lastly divide out the second beam 
factor B0)-. 

where B represents a Eourier transform and B{9) = 
-|-Hyy(0)]^/^. Consistent treatment of the vari¬ 
ances cube requires uv space division of two factors of 
the weights cube followed by image space division of two 
factors of the beam. 

Lastly, we frequency average from 80 kHz to 160 
kHz, flagging a single 160 kHz channel the edge of each 
1.28 MHz coarse channel due to polyphase filter bank at¬ 
tenuation and aliasing, which make these channels dif¬ 
ficult to reliably calibrate. Following [T3], we also flag 


poorly observed uv cells and uv cells whose observation 
times vary widely between frequencies. In all cases, we 
formally set the variance in flagged channels and uv cells 
in to infinity and use the pseudoinverse to project 
out flagged modes m- 


C. Power Spectrum Results 

We can now present the results of our method ap¬ 
plied to 3 h of MWA-128T data. We first study cylin- 
drically averaged, two-dimensional (2D) power spectra 
and their statistics, since they are useful for seeing the 
effects of foregrounds and systematic errors on the power 
spectrum. We form these power spectra with the full 
30.72 MHz instrument bandwidth to achieve maximal fc|| 
resolution. 

We begin with the 2D power spectrum itself (Figure 

in which several important features can be observed. 
First, the wedge and EoR window are clearly distinguish¬ 
able, with foregrounds suppressed by at least 5 orders of 
magnitude across most of the EoR window. At high k±, 
the edge of the wedge is set by the horizon while at low 
k± the cutoff is less clear. There appears to be some level 
of suprahorizon emission, which was also observed with 
PAPER in [2S] and further explained by [35]. Consis¬ 
tent with Figure [2 we see the strongest foreground resid¬ 
ual power at low k±, meaning that there still remains 
a very large contribution from diffuse emission from our 
Galaxy—potentially from sidelobes of the primary beam 
affecting the shortest baselines [331 EH] • 

We also see evidence for less-than-ideal behavior. 
Through we identified spectral structure appearing at 
fc|| ~ 0.45/iMpc“^ in Figure and included it in our 
foreground residual covariance, that contamination still 
appears here as a horizontal line. By including it in the 
foreground residual model, we increase the variance we 
associate with those modes and we decrease the leakage 
out of those modes, isolating the effect to only a few fc|| 
bins. 

While Figure l^hows the magnitude of the 2D power 
spectrum, Figurej^shows its sign using a split color scale, 
providing another way to assess foreground contamina¬ 
tion in the EoR window. Because we are taking the cross 
power spectrum between two cubes with identical sky 
signal but independent noise realizations, the noise dom¬ 
inated regions should be positive or negative with equal 
probability. This is made possible by our use of a power 
spectrum estimator normalized such that S = Cov(p) is 
a diagonal matrix [33] . This choice limits leakage of fore¬ 
ground residuals from the wedge into the EoR window 

m- 

By this metric, the EoR window is observed to be 
noise dominated with only two notable exceptions. The 
first is the region just outside the wedge at low k± at¬ 
tributable to suprahorizon emission due to some com¬ 
bination of intrinsic foreground spectral structure, beam 
chromaticity, and finite bandwidth. This suggests our ag- 
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FIG. 5. Our power spectrum clearly exhibits the typical 
EoR window structure with orders-of-magnitude suppression 
of foregrounds in the EoR window. Here we plot our estimates 
for |P(fcj^, fc||)| for the full instrumental bandwidth, equiva¬ 
lent to the range 2 = 6.2 to 2 = 7.5. Overplotted is the wedge 
from Equation[^corresponding to the first null in the primary 
beam (dash-dotted line), the horizon (dashed line), and the 
horizon with a relatively aggressive 0.02 h Mpc“ 1 buffer (solid 
curve). In addition to typical foreground structure, we also 
see the effect of noise at high and very low k± where base¬ 
line coverage is poor. We also clearly see a line of power at 
constant fc|| ~ 0.45/i Mpc“^, attributable to miscalibration of 
the instrument’s bandpass and cable reflections [69]. 


gressive 0.02 ft, Mpc“^ cut beyond the horizon will leave 
in some foreground contamination when we bin to form 
one-dimensional (ID) power spectra. As long as we are 
only claiming an upper limit on the power spectrum, this 
is fine. A detection of foregrounds is also an upper limit 
on the cosmological signal. More subtle is the line of 
positive power at fty ~ 0.45ftMpc“^, confirming our hy¬ 
pothesis that the spike observed in Figure]^ is indeed an 
instrumental systematic since it behaves the same way 
in both time-interleaved data cubes. There is also a hint 
of a similar effect at fty ~ 0.75ftMpc“^, possibly visible 
in Figure as well. We attribute both to bandpass mis¬ 
calibration due to cable reflections, complicated at these 
frequency scales by the imperfect channelization of the 
MWA’s two-stage polyphase filter, as well as slight an¬ 
tenna dependence of the bandpass due to cable length 


FIG. 6. By using an estimator of the power spectrum with 
uncorrelated errors between bins, we can see that most of 
the EoR window is noise dominated in our power spectum 
measurement. Here we show the inverse hyperbolic sine of 
the power spectum, which behaves linearly near zero and log¬ 
arithmically at large magnitudes. Because we are taking a 
cross power spectrum between two data cubes with uncorre¬ 
lated noise, noise dominated regions are equally likely to have 
positive power as negative power. Since we do not attempt to 
subtract a foreground bias, foreground contaminated regions 
show up as strongly positive. That includes the wedge, the 
bandpass line at ft|| ~ 0.45ftMpc“^ (see Eigure|M, and some 
of the EoR window at low k± and relatively low consistent 
with the suprahorizon emission seen in |26| . 

variation |69j . 

Additionally, the quadratic estimator formalism relates 
our covariance models of residual foregrounds and noise 
to the expected variance in each band power [nmniiisi, 
which we plot in Figure As we have chosen our power 
spectrum normalization M such that S = Cov(p) is di¬ 
agonal, it is sufficient to plot the diagonal of the 

standard deviation of each band power. The EoR win¬ 
dow is seen clearly here as well. There is high variance 
at low and high k± where the uv coverage is poor, and 
also in the wedge due to foreground residuals. It is par¬ 
ticularly pronounced in the bottom left corner, which is 
dominated by residual diffuse foregrounds. 

As our error covariance represents the error due to both 
noise and foregrounds we expect to make in an estimate 


































12 



log,^[Error on P(k)] (mK^ h’^Mpc®) 


FIG. 7. By including both residual foregrounds and noise in 
C, our model for the covariance, we can calculate the expected 
variance on each band power in p, which we show here. We see 
more variance at high (and also very low) where we have 
few baselines. We also see high variance at low fc|| consistent 
with foregrounds. We see the strongest foregrounds at low 
k±, which implies that the residual foregrounds have a very 
strong diffuse component that we have much to gain from bet¬ 
ter diffuse models to subtract. We also see that foreground- 
associated variance extends to higher fc|| at high k±, which is 
exactly the expected effect from the wedge. Both these ob¬ 
servations are consistent with the structure of the eigenmodes 
we saw in Figure Because we have chosen a normalization 
of p such that the Cov(p) is diagonal, this is a complete de¬ 
scription of our errors. Furthermore, it means that the band 
powers form a mutually exclusive and collectively exhaustive 
set of measurements. 


of the 21 cm signal, it is interesting to examine the “sig¬ 
nal to error ratio” in Figure [8]— the ratio of Figure to 
Figure [7j The ratio is of order unity in noise dominated 
regions—though it is slightly lower than what we might 
naively expect due to our conservative estimate of sna. 
That explains the number of modes with very small val¬ 
ues in Figurej^ In the wedge and just above it, however, 
the missubtracted foreground bias is clear, appearing as 
a high significance “detection” of the foreground wedge 
in the residual foregrounds. The bandpass miscalibra- 
tion line at fey ^ 0.45/iMpc“^ also appears clearly due 
to both foreground bias and possibly an underestimation 
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FIG. 8. The foregrounds’ wedge structure is particularly clear 
when looking at the ratio of our measured power spectrum to 
the modeled variance, shown here. Though the variance in 
foreground residual dominated parts of the fcx-fc|| plane are 
elevated (see Figure]^, we still expect regions with signal to 
error ratios greater than one. This is largely due to the fact 
that we choose not to subtract a foreground bias for fear of 
signal loss. This figure shows us most clearly where the fore¬ 
grounds are important and, as with Figure]^ it shows where 
we can hope to do better with more integration time and 
where we need better calibration and foreground modeling to 
further integrate down. 


of the errors. Hedging against this concern, we simply 
project out this line from our estimator that bins 2D 
power spectra into ID power spectra by setting the vari¬ 
ance of those bins to infinity. 

Though useful for the careful evaluation of our tech¬ 
niques and of the instrument, the large bandwidth data 
cubes used to make Figures and [^encompass long pe¬ 
riods of cosmic time over which the 21cm power spec¬ 
trum is expected to evolve. The cutoff is usually taken 
to be Az < 0.5 [5]. These large data cubes also violate 
the assumption in [43] that channels of equal width in 
frequency correspond to equal comoving distances, justi¬ 
fying the use of the fast Fourier transform. Therefore, we 
break the full bandwidth into three 10.24 MHz segments 
before forming spherically averaged power spectra, and 
estimate the foreground residual covariance and power 
spectrum independently from each. We bin our 2D power 
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spectra into ID power spectra using the optimal estima¬ 
tor formalism of |13j . In our case, since we have chosen 
M such that S is diagonal, this reduces to simple inverse 
variance weighting with the variance on modes outside 
the EoR window or in the ~ 0.45/iMpc“^ line set to 
infinity. 

In Figure we show the result of that calculation as 
a “dimensionless” power spectra A^(fc) = k^P(k)/2'K^. 
We choose our binning such that the window functions 
(calculated as in |13j from our covariance model) were 
slightly overlapping. 

Our results are largely consistent with noise. Since 
noise is independent of fey and k « /cy for most modes 
we measure, the noise in A^(fc) scales as k^. We see 
deviations from that trend at low k where modes are 
dominated by residual foreground emission beyond the 
horizon wedge and thus show elevated variance and bias 
in comparison to modes at higher k. Since we do not 
subtract a bias, even these “detections” are upper limits 
on the cosmological signal. 

A number of barely significant “detections” are ob¬ 
served at higher k. Though we excise bins associated 
with the fcy ^ 0.45/iMpc“^ line, the slight detections 
may be due to leakage from that line. At higher z, the 
feature may due to reflections from cables of a differ¬ 
ent length, though some may be plausibly attributable 
to noise. Deeper integration is required to investigate 
further. 

Our best upper limit at 95% confidence is A^(fc) < 
3.7 X 10^ mK^ at fc = 0.18Mpc“^ around 2 = 6.8. Our 
absolute lowest limit is about 2 times lower than the best 
limit in [T3], though the latter was obtained at substan¬ 
tially higher redshift and lower k, making the two some¬ 
what incomparable. Our best limit is roughly 3 orders 
of magnitude better than the best limit of [13] over the 
same redshift range, and the overall noise level (as mea¬ 
sured by the part of the power spectrum that scales as 
k^) is more than 2 orders of magnitude smaller. This 
cannot be explained by more antenna tiles alone; it is 
likely that the noise level was overestimated in m due 
to insufficiently rapid time interleaving of the data cubes 
used to infer the overall noise level. 

Although one cannot directly compare limits at dif¬ 
ferent values of k and z, our limit is similar to the 
GMRT limit |8], 6.2 x 10^ mK^ at fc = 0.50Mpc“^ and 
z — 8.6 with 40 h of observation, and remains higher 
than the best PAPER limit [T^ of 502 mK^ between 
k = 0.15/iMpc“^ and k = 0.50/iMpc“^ and z = 8.4 
with 4.5 months of observation. 

In Figure we also plot a theoretical model from 
mi predicting that reionization has ended by the low¬ 
est redshift bin we measure. We remain more than 3 
orders of magnitude (in mK^) from being able to de¬ 
tect that particular reionization model, naively indicating 
that roughly 3000 h of data are required for its detection. 
This appears much larger than what previous sensitivity 
estimates have predicted for the MWA (e.g. m) in the 
case of idealized foreground subtraction. 


However, much of this variance is due to the residual 
foregrounds and systematics in the EoR window identi¬ 
fied by our empirical covariance modeling method, not 
thermal noise (see Figure]^. More integration will not 
improve those modes unless it allows for a better under¬ 
standing of our instrument, better calibration, and better 
foreground models—especially of diffuse emission which 
might contaminate the highly sensitive bottom left corner 
of the EoR window. Eliminating this apparent “supra- 
horizon” emission, seen most clearly as detections in Fig- 
ure|^below k « 0.2 h Mpc“^, is essential to achieving the 
forecast sensitivity of the MWA m- If we can do so, we 
may still be able to detect the EoR with 1000 h or fewer. 
This is especially true if we can improve the subtraction 
of foregrounds to the point where we can work within 
the wedge, which can vastly increase the sensitivity of 
the instrument |551157j . On the other hand, more data 
may reveal more systematics lurking beneath the noise 
which could further diminish our sensitivity. 


IV. SUMMARY AND FUTURE DIRECTIONS 


In this work, we developed and demonstrated a method 
for empirically deriving the covariance of residual fore¬ 
ground contamination, in observations designed to 

measure the 21cm cosmological signal. Understanding 
the statistics of residual foregrounds allows us to use the 
quadratic estimator formalism to quantify the error as¬ 
sociated with missubtracted foregrounds and their leak¬ 
age into the rest of the EoR window. Because of the 
complicated interaction between the instrument and the 
foregrounds, we know that the residual foregrounds will 
have complicated spectral structure, especially if the in¬ 
strument is not perfectly calibrated. By deriving our 
model for empirically, we could capture those effects 
faithfully and thus mitigate the effects of foregrounds in 
our measurement (subject to certain caveats which we 
recounted in Section IID). 


Our strategy originated from the assumption that the 
frequency-frequency covariance, modeled as a function of 
|u|, is the most important component of the foreground 
residual covariance. We therefore used sample covari¬ 
ances taken in annuli in Fourier space as the starting 
point of our covariance model. These models were ad¬ 
justed to avoid double counting the noise variance and 
filtered in Fourier space to minimize the effect of noise 
in the empirically estimated covariances. Put another 
way, we combined our prior beliefs about the structure 
of the residual foregrounds with their observed statistics 
in order to build our models. 

We demonstrated this strategy through the power 
spectrum analysis of a 3h preliminary MWA data set. 
We saw the expected wedge structure in both our power 
spectra and our variances. We saw that most of the 
EoR window was consistent with noise, and we under¬ 
stand why residual foregrounds and systematics affect 
the regions that they do. We were also able to set new 
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I • Even/Odd Cross A^(k) - 20 - Errors and 20%-80% Window Functions — Thermal Noise 2n- Limits -Theoretical A^(k) (Barkana 2009) | 


FIG. 9. Finally, we can set confident limits on the 21cm power spectrnm at three redshifts by splitting our simultaneous 
bandwidth into three 10.24 MHz data cubes. The lowest k bins show the strongest “detections,” though they are attributable 
to suprahorizon emission [26] that we expect to appear because we only cut out the wedge and a small buffer (0.02 h Mpc“^) 
past it. We also see marginal “detections” at higher k which are likely due to subtle bandpass calibration effects like cable 
reflections. The largest such error, which occurs at bins around fe|| ~ 0.45/iMpc“^ and can be seen most clearly in Figure 
has been flagged and removed from all three of these plots. Our absolute lowest limit requires A^(fc) < 3.7 x 10"* mK^ at 
95% confidence at comoving scale k — 0.18/iMpc“^ and 2 = 6.8, which is consistent with published limits |8l I12IH5| . We 
also include a simplistic thermal noise calculation (dashed line), based on our observed system temperature. Though it is not 
directly comparable to our measurements, since it has different window functions, it does show that most of our measurements 
are consistent with thermal noise. For comparison, we also show the theoretical model of [71] (which predicts that reionization 
ends before 2 = 6.4) at the central redshift of each bin. While we are still orders of magnitude away from the fiducial model, 
recall that the noise in the power spectrum scales inversely with the integration time, not the square root. 


MWA limits on the 21 cm power spectrum from z = 6.2 
to 7.5, with an absolute best 95% confidence limit of 
A‘^{k) < 3.7 X 10"^ mK^ at A: = 0.18/iMpc“^ and 2 = 6.8, 
consistent with published limits mini. 

This work suggests a number of avenues for future re¬ 
search. Of course, improved calibration and mapmak¬ 
ing fidelity—especially better maps of diffuse Galactic 
structure—will improve power spectrum estimates and 
and allow deeper integrations without running up against 
foregrounds or systematics. Relaxing some of the map¬ 
making and power spectrum assumptions discussed in 
Section m may further mitigate these effects. A start¬ 
ing point is to integrate the mapmaking and statistical 
techniques of jH] with the fast algorithms of [13] . The 
present work is based on the idea that it is simpler to 
estimate from the data than from models of the in¬ 
strument and the foregrounds. But if we can eliminate 
systematics to the point where we really understand P, 
the relationship between the true sky and our dirty maps, 
then perhaps we can refocus our residual foreground co- 
variance modeling effort on the statistics of the true sky 


residuals using the fact that = pc^'^^ruepx^ Qj^_ 

taining such a complete understanding of the instrument 
will be challenging, but it may be the most rigorous way 
to quantify the errors introduced by missubtracted fore¬ 
grounds and thus to confidently detect the 21 cm power 
spectrum from the epoch of reionization. 
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